Introduction

Reusable Launch Vehicles (RLVs) have different mission requirements than the Space Shuttle,
which is used as the benchmark for guidance design. Alternative Terminal Area Energy
Management (TAEM) and Approach and Landing (A/L) guidance schemes can therefore be
examined in the interest of cost reduction. This paper presents a neural network based solution
to a finite-horizon trajectory optimization problem. In this approach, the optimal trajectory of
the vehicle is produced by adaptive critic based neural networks that were trained off-line to
maintain a gradual glideslope.

Problem Formulation and Solution Development 

The point mass equations of motion over a flat Earth are used as the model for the trajectory
propagation dynamics. The final time for a TAEM and A/L trajectory varies with each set of
initial conditions, so it is more beneficial to use an independent variable whose final value is
invariant. The independent variable must also be monotonically increasing or decreasing. This
led to the reformulation of the vehicle dynamics with altitude, $h$, as the independent variable.
The procedure is shown in Equation 1, where $X_i$ represents the ith state, $f_i$ is the right-hand
side of the ith equation of motion, $V$ is the velocity, $\gamma$ is the flight path angle, and $t$
represents time.

    \frac{dX_i}{dh} = \frac{dX_i}{dt} \frac{dt}{dh} = \frac{f_i}{V \sin\gamma}        (1)

This procedure reduces the order of the vehicle system to five because the state variable $h$
becomes the independent variable [5].

The current TAEM and A/L guidance is divided into flight phases: Acquisition, Heading
Alignment Cone (HAC), Pre-Final, and A/L. The cost function for the trajectory generation,
shown in Equation 2, was selected to minimize the controlled parameters, $\delta C_L$ and
$\delta\sigma$, so that the system operates near the steady state values, $C_{L_{ss}}$ and
$\sigma_{ss}$, during each flight phase.

    J = \int \left( \delta C_L^2 + \delta\sigma^2 \right) dh

    C_L = C_{L_{ss}} + \delta C_L                                                       (2)

    \sigma = \sigma_{ss} + \delta\sigma

Each flight phase has different goals to accomplish; therefore, an optimal trajectory is one in
which each trajectory phase is optimized according to its goal. To accomplish this, the selected
steady state values of the controls vary between the trajectory phases. Table 1 shows the steady
state values for each flight phase considered in this formulation [6]. Because the flight phases
are defined by altitude ranges, the cost function transitions are simple to implement [7].
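The change of independent variable described above can be sketched in a few lines. This is a minimal illustration, assuming the time derivatives of the states are already available from some model; the function name `altitude_derivatives` and its arguments are hypothetical and do not come from the paper. It divides each time derivative by dh/dt = V sin(gamma) and refuses to proceed when altitude is not changing, which is the monotonicity requirement stated in the text.

```python
import math

def altitude_derivatives(time_derivs, velocity, gamma):
    """Convert dX_i/dt into dX_i/dh using dh/dt = V * sin(gamma).

    time_derivs: list of f_i values (right-hand sides of the equations of motion)
    velocity:    V, in m/s
    gamma:       flight path angle, in radians
    """
    h_dot = velocity * math.sin(gamma)
    # Altitude can only serve as the independent variable while it is
    # strictly increasing or decreasing, i.e. while dh/dt is nonzero.
    if abs(h_dot) < 1e-9:
        raise ValueError("dh/dt is zero; altitude cannot be the independent variable")
    return [f_i / h_dot for f_i in time_derivs]

# Illustrative numbers only: a 30 degree descent at 100 m/s gives dh/dt = -50 m/s.
example = altitude_derivatives([10.0, -5.0], 100.0, math.radians(-30.0))
```

With these illustrative inputs each time derivative is divided by roughly -50 m/s, so the signs flip because altitude decreases as time advances.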
Table 1: Cost Function Parameters


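Flight phases: Acquisition, HAC, Pre-Final, A/L

$C_{L_{ss}}$:
    $f(\dot{\gamma} = 0)$

$\sigma_{ss}$:
    Acquisition:  $f(\psi - \psi_{des})$
    HAC:          $\sigma_{des}$ from Turn Radius $= V V_{horizontal} / (g \tan\sigma_{des})$,
                  with $TR = R_F + R_1 \psi + R_2 \psi^2$
    Pre-Final:    $f(\dot{\psi} = 0)$

Terminal Constraints:
    Acquisition:  states$_f$ = states$_{HAC_i}$
    HAC:          states$_f$ = states$_{P-F_i}$, $\psi_f = 0$
    Pre-Final:    states$_f$ = states$_{A/L_i}$, $\psi_f = 0$
    A/L:          $V_f \sin\gamma_f = -2$, $h_f = x_f = y_f = 0$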
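Because the phases are defined by altitude ranges, switching the steady state values inside the cost function reduces to a lookup on altitude. The sketch below illustrates that idea only. The altitude boundaries in `PHASE_FLOORS` are placeholders invented for the example, not values from this study, and the function names are hypothetical.

```python
# Hypothetical lower altitude bound (m) of each phase, ordered from high to low.
# These numbers are placeholders for illustration, not values from the paper.
PHASE_FLOORS = [
    ("Acquisition", 9000.0),
    ("HAC", 4000.0),
    ("Pre-Final", 2000.0),
    ("A/L", 0.0),
]

def phase_at(altitude_m):
    """Return the name of the flight phase that contains the given altitude."""
    for name, floor in PHASE_FLOORS:
        if altitude_m >= floor:
            return name
    raise ValueError("altitude is below the runway reference")

def cost_integrand(c_l, sigma, c_l_ss, sigma_ss):
    """Squared deviation of the controls from their steady state values.

    This is the quantity accumulated over altitude by the cost function;
    c_l_ss and sigma_ss are the steady state values of the current phase.
    """
    return (c_l - c_l_ss) ** 2 + (sigma - sigma_ss) ** 2
```

A trajectory integrator would call `phase_at` at each altitude step, pick the steady state values for that phase, and accumulate `cost_integrand` times the altitude step.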


Neural Network Solution Development 

An action network and a critic network comprise the adaptive critic neural network structure.
The action network behaves as a controller, while the critic network behaves as a supervisory
network that criticizes, or evaluates, the outputs of the action network [3]. The system model
is used with the cost function to find an optimal control, u, such that the cost function is
minimized. The optimal control problem can be formulated in terms of the Hamiltonian [2]. The
propagation equations for the Lagrange multipliers, or co-states, are subject to the boundary
condition and are used along with the system model and the control equations derived from the
optimality condition to generate the optimal control for the system. The neural network method
of solution development is shown in Figure 1. In this figure, the outputs of the “CRITIC” block
are the co-states, the output of the “CONTROL” block is obtained through the optimality
condition, and the “PLANT” block represents the system model [1]. The solution of
finite-horizon problems with neural networks evolves in two stages: the synthesis of the last
network and the propagation of the remaining networks [4].

Figure 1: Neural Network Solution Development
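For reference, the Hamiltonian formulation mentioned above leads to the following standard discrete-time necessary conditions. The notation is assumed for illustration and is not taken from this study: $F$ is the discretized system model, $\Psi$ is the stage cost, $\phi$ is the terminal cost, and $k$ indexes the steps of the independent variable.

```latex
% System model and cost (assumed generic form)
x_{k+1} = F(x_k, u_k), \qquad
J = \phi(x_N) + \sum_{k=0}^{N-1} \Psi(x_k, u_k)

% Hamiltonian
H_k = \Psi(x_k, u_k) + \lambda_{k+1}^{T} F(x_k, u_k)

% Co-state propagation, optimality condition, and boundary condition
\lambda_k = \frac{\partial H_k}{\partial x_k}, \qquad
\frac{\partial H_k}{\partial u_k} = 0, \qquad
\lambda_N = \frac{\partial \phi}{\partial x_N}
```

In the terms of Figure 1, the co-state equation is what the critic approximates, the optimality condition yields the control, and $F$ plays the role of the plant.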
Synthesis of Last Network

The synthesis of the last network is shown in Figure 2 and described below. 



Figure 2: Synthesis of Last Network 


1. Generate various random values of states$_{N-1}$, states$_{N-2}$, states$_{N-3}$ and calculate
$\lambda_N$, $\lambda_{N-1}$, and $u_{N-1}$, the co-states and optimal control respectively, by
solving the discrete algebraic propagation and optimality equations.

2. Train two neural networks, the action network (ANN) and the critic network (CNN).
The ANN inputs values of states$_{N-1}$ and outputs $u_{N-1}$. The CNN inputs states$_{N-2}$
and outputs $\lambda_{N-1}$.
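The last-stage calculation in step 1 can be illustrated on a toy problem. The sketch below is not the RLV model: it assumes a hypothetical scalar linear system with a quadratic cost so that the algebraic equations have a closed-form solution, and it uses a least-squares line through the origin as a stand-in for training the action network. All constants and function names are invented for the example.

```python
import random

# Hypothetical scalar test problem (not the RLV model):
#   x[k+1] = A*x[k] + B*u[k]
#   J = 0.5*S*x[N]**2 + sum over k of 0.5*(Q*x[k]**2 + R*u[k]**2)
A, B, Q, R, S = 0.9, 0.5, 1.0, 2.0, 5.0

def last_stage_targets(x_nm1):
    """Solve the stage N-1 optimality and co-state equations for one state."""
    # Optimality condition R*u + B*lam_N = 0, with lam_N = S*x_N and
    # x_N = A*x + B*u, solved algebraically for u.
    u_nm1 = -(A * B * S) / (R + B * B * S) * x_nm1
    x_n = A * x_nm1 + B * u_nm1
    lam_n = S * x_n                   # boundary condition on the co-state
    lam_nm1 = Q * x_nm1 + A * lam_n   # co-state propagation
    return lam_n, lam_nm1, u_nm1

def fit_linear(inputs, targets):
    """Least-squares slope through the origin; stand-in for network training."""
    return sum(x * t for x, t in zip(inputs, targets)) / sum(x * x for x in inputs)

random.seed(0)
states_nm1 = [random.uniform(-1.0, 1.0) for _ in range(200)]
targets = [last_stage_targets(x) for x in states_nm1]

# "Action network": maps states_{N-1} to u_{N-1}.
ann_gain = fit_linear(states_nm1, [t[2] for t in targets])
# The lam_{N-1} targets are kept for the critic, which is paired with states_{N-2}.
lam_nm1_targets = [t[1] for t in targets]
```

For a nonlinear vehicle model the closed-form line in `last_stage_targets` would be replaced by a numerical solve of the same equations, and `fit_linear` by actual network training.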

Synthesis of Remaining Networks 

The process of forming the other networks is described below and shown in Figure 3. 


Figure 3: Synthesis of Remaining Networks

1. Algebraically back propagate the trajectory (ABTP) with the states$_{N-1}$ and $\lambda_{N-1}$
to get the corresponding states$_{N-2}$.

2. Train the CNN with the states$_{N-2}$ and the corresponding $\lambda_{N-1}$.

3. Expand states$_{N-2}$ and use the CNN to obtain the corresponding $\lambda_{N-1}$.

4. Algebraically forward propagate the trajectory (AFTP) with the expanded states$_{N-2}$ and
the corresponding $\lambda_{N-1}$ to obtain the corresponding states$_{N-1}$.

5. ABTP with the new states$_{N-1}$ to obtain states$_{N-2}$, $\lambda_{N-2}$, and $u_{N-2}$.

6. Train an ANN with the states$_{N-2}$ as input and $u_{N-2}$ as output.
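The six steps above can be traced on a toy problem. The sketch below assumes a hypothetical scalar linear system with a quadratic cost, not the RLV model, and replaces network training with a least-squares line through the origin. The constants, the helper names `abtp`, `aftp`, and `costate_nm1`, and the sampling ranges are all invented for illustration. It also assumes the co-state at stage N-1 is available from the last-stage solution.

```python
import random

# Hypothetical scalar problem: x[k+1] = A*x[k] + B*u[k],
# J = 0.5*S*x[N]**2 + sum over k of 0.5*(Q*x[k]**2 + R*u[k]**2)
A, B, Q, R, S = 0.9, 0.5, 1.0, 2.0, 5.0

def fit_linear(inputs, targets):
    """Least-squares slope through the origin; stand-in for network training."""
    return sum(x * t for x, t in zip(inputs, targets)) / sum(x * x for x in inputs)

def costate_nm1(x_nm1):
    """lambda_{N-1} from the last-stage solution (closed form for this toy case)."""
    u = -(A * B * S) / (R + B * B * S) * x_nm1
    return Q * x_nm1 + A * S * (A * x_nm1 + B * u)

def abtp(x_next, lam_next):
    """Back propagation: recover x_k, lambda_k, u_k from the stage k+1 values."""
    u = -B * lam_next / R          # optimality condition R*u + B*lam_next = 0
    x = (x_next - B * u) / A       # invert x_next = A*x + B*u
    lam = Q * x + A * lam_next     # co-state propagation
    return x, lam, u

def aftp(x, lam_next):
    """Forward propagation: x_{k+1} from x_k and lambda_{k+1}."""
    return A * x + B * (-B * lam_next / R)

random.seed(1)
# Step 1: ABTP from random states_{N-1} and their co-states to states_{N-2}.
x_nm1 = [random.uniform(-1.0, 1.0) for _ in range(100)]
lam_nm1 = [costate_nm1(x) for x in x_nm1]
x_nm2 = [abtp(x, lam)[0] for x, lam in zip(x_nm1, lam_nm1)]
# Step 2: train the critic on states_{N-2} -> lambda_{N-1}.
cnn_gain = fit_linear(x_nm2, lam_nm1)
# Step 3: expand the range of states_{N-2} and query the critic.
x_nm2_wide = [random.uniform(-3.0, 3.0) for _ in range(100)]
lam_pred = [cnn_gain * x for x in x_nm2_wide]
# Step 4: AFTP to the corresponding states_{N-1}.
x_nm1_new = [aftp(x, lam) for x, lam in zip(x_nm2_wide, lam_pred)]
# Step 5: ABTP with the new states_{N-1}.
back = [abtp(x, costate_nm1(x)) for x in x_nm1_new]
# Step 6: train the action network on states_{N-2} -> u_{N-2}.
ann_gain = fit_linear([b[0] for b in back], [b[2] for b in back])
```

Because this toy problem is linear-quadratic, the fitted action gain can be checked against the gain from the discrete Riccati recursion, which gives a simple sanity test before moving to a nonlinear model.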


Results 

The neural network synthesis described above was used to generate TAEM and A/L
trajectories. The trajectories were generated with down range, rather than altitude, as the
independent variable so that the results can first be validated against an existing result. The
performance of the neural network generated trajectories is obtained by assuming any state
variable value within the trained scope and using the appropriate lift coefficient correction
factor, $\delta C_L$, during simulations to generate the trajectories. The down range histories,
starting from -1000 m, of the velocity, the flight path angle, and the altitude for various initial
conditions are shown in Figure 4. The ranges of these initial conditions are provided in Table 2.
These results were generated without the use of the CNN as an expander of initial conditions,
so that this trajectory set could be compared to an existing trajectory to determine its validity.

Figure 4: Down Range Neural Network Trajectory Histories (initial conditions random within
training scope; down range from -1000 m to 0 m)

Table 2: Initial Condition Ranges (from -1000 m Down Range)

                            Maximum    Minimum
Velocity (m/s)              157        156
Flight Path Angle (deg)     -28.8      -29.7
Altitude (m)                248        241

Figure 5 demonstrates that, starting at a down range of -1000 m from the runway, the RLV can
meet the final constraints, $x_f = 0$ and $V_f \sin\gamma_f = -2$ m/s, for any initial condition
within the predefined range.

Figure 5: Satisfaction of Final Constraints

The down range history of the lift coefficient, $C_L$, which is used as the guidance control
variable, is displayed in Figure 6 along with the down range histories of the lift coefficient
components. From this figure it is determined that the commanded lift coefficient values are
within an acceptable range for an RLV.

Figure 6: Control History
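The validation procedure described in this section, drawing initial conditions within the Table 2 ranges and checking the final constraints, can be sketched as follows. The ranges are those of Table 2 and the constraints are those stated in the text. The function names and the tolerance are assumptions made for the example, and no trajectory simulation is included.

```python
import math
import random

# Initial condition ranges from Table 2 (at -1000 m down range).
IC_RANGES = {
    "velocity_mps": (156.0, 157.0),
    "gamma_deg": (-29.7, -28.8),
    "altitude_m": (241.0, 248.0),
}

def sample_initial_condition(rng):
    """Draw one random initial condition inside the Table 2 ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in IC_RANGES.items()}

def meets_final_constraints(x_f, v_f, gamma_f_deg, tol=0.05):
    """Check x_f = 0 and V_f*sin(gamma_f) = -2 m/s within a tolerance.

    The tolerance value is a placeholder chosen for illustration.
    """
    sink_rate = v_f * math.sin(math.radians(gamma_f_deg))
    return abs(x_f) <= tol and abs(sink_rate + 2.0) <= tol

rng = random.Random(0)
initial_condition = sample_initial_condition(rng)
```

In use, each sampled initial condition would be passed to the trajectory simulation, and the final state it returns would be passed to `meets_final_constraints`.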


Future Work 

Once the trajectory comparison has been completed, the CNN expansion, as described in Section
3.2, will be applied. After its successful application, the study will transition to the use of
altitude as the independent variable, and the 6-degrees-of-freedom trajectory synthesis will
commence. The neural network synthesis will continue until the TAEM interface is reached.
The final results of this study will cover a larger range of initial conditions that, like the
trajectories shown in Figure 5, meet the predetermined final constraints. Finally, the neural
network guidance will be implemented in the Marshall Aerospace VEhicle Representation In C
(MAVERIC) for a final validation on the X-33 aircraft simulation.

Conclusion 

A neural network approach to developing optimal TAEM trajectories has been presented in this 
study. This approach solves a nonlinear guidance problem without linearizing the model. The 
results of the future work will demonstrate the powerful capability of adaptive critic based neural 
networks to solve this class of problems. 